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Abstract.  This  paper  presents  the  derivation  of  a  new  algorithm  for  the  stable  computation  of 
sample  partial  correlation  coefficients. 

,  •  _}Ve  start  with  Bareiss’  algorithm  for  the  solution  of  linear  systems  of  equations  with  (non- 
"gymmetric)  Toeplitz  coefficient  matrix  and  show  how  to  generalize  it  to  matrices  that  are  not 
Toeplitz.  The  so  generalized  Bareiss  algorithm  computes  the  LU  and  UL  factorizations  of  those  ma¬ 
trices  whose  contiguous  principal  submatrices  are  all  non-singular.  For  symmetric  positive-definite 
matrices  A ,  which  naturally  satisfy  this  condition,  the  normalized  version  of  Bareiss’  algorithm  is 
just  the  Hyperbolic  Cholesky  algorithm,  which  computes  the  upper  and  lower  triangular  Cholesky 
factors  U  and  L  of  A  by  means  of  2  X  2  hyperbolic  rotations. 

Guided  by  the  data  flow  graph  of  the  Hyperbolic  Cholesky  algorithm,  we  show  that  there  exists 
one  sequence  of  2  X  2  orthogonal  rotations  that  effects  the  transformation  from  U  to  L  such  that 
the  sines  of  these  rotations  equal  the  hyperbolic  tangents  of  the  hyperbolic  rotations  From  the 
connection  to  the  Hyperbolic  Cholesky  algorithm  it  also  follows  that,  if  A  is  a  sample  covariance 
matrix,  the  sines  are  sample  partial  correlation  coefficients.  <T  j. 

Consequently,  given  a  data  matrix  B  it  is  recommended  to  determine  sample  partial  correla¬ 
tions  directly  from  the  columns  of  B  instead  of  forming  A  =  BT B  and  inviting  potential  loss  of 
numerical  accuracy:  compute  the  QR  decomposition  B  =  QU;  apply  plane  rotations  to  transform 
V  to  L\  the  sines  of  the  rotation  angles  are  partial  correlations. 


From  Bareiss’  Algorithm  to  the  Stable 
Computation  of  Partial  Correlations 


Jean-Marc  Delosme1,  Ilse  C.F.  Ipsen2 

Research  Report  YALEU/DCS/RR-624 
May  1988 


DTIC 

ELECTEI 


AUG  0  21988 


D 


1  Department  of  Electrical  Engineering,  Yale  University 

2  Department  of  Computer  Science,  Yale  University 

The  work  presented  in  this  paper  was  supported  by  the  Office  of  Naval  Research  under  con¬ 
tracts  N000014-86-K-0310  and  N00014-85-K-0461,  and  by  the  Army  Research  Office  under  contract 
DAAL03-86-K-0158.  Approved  for  public  release:  distribution  is  unlimited. 


1.  Introduction 

In  [6,  7]  a  new  algorithm  was  introduced  for  the  stable  computation  of  sample  partial  correlation 
coefficients:  the  partial  correlations  are  computed  directly  from  the  data  matrix  so  as  to  avoid  the 
loss  of  numerical  accuracy  typically  associated  with  the  formation  of  the  sample  covariance  matrix 
(numerical  examples  are  given  in  [6,  7]).  This  paper  recounts  the  context  that  led  to  the  discovery 
of  this  result  and  presents  derivations  of  the  algorithm  from  multiple  vantage  points. 

In  Section  2  we  start  with  Bareiss’  algorithm  for  the  solution  of  linear  systems  of  equations 
with  (non-symmetric)  Toeplitz  coefficient  matrix  [3]  and  show  how  to  generalize  it  to  matrices  that 
are  not  Toeplitz.  The  so  generalized  Bareiss  algorithm  computes  the  LU  and  UL  factorizations  of 
those  matrices  whose  contiguous  principal  submatrices  are  all  non-singular.  For  symmetric  positive- 
definite  matrices  A,  which  naturally  satisfy  this  condition,  Section  3  shows  that  the  normalized 
version  of  Bareiss’  algorithm  is  just  the  Hyperbolic  Cholesky  algorithm  [5],  which  computes  the 
upper  and  lower  triangular  Cholesky  factors  U  and  L  of  A  by  means  of  2  x  2  hyperbolic  rotations. 

Guided  by  the  data  flow  graph  of  the  Hyperbolic  Cholesky  algorithm,  we  show  that  there  exists 
one  sequence  of  2  X  2  orthogonal  rotations  that  effects  the  transformation  from  U  to  L  such  that 
the  sines  of  these  rotations  equal  the  hyperbolic  tangents  of  the  hyperbolic  rotations.  From  the 
connection  to  the  Hyperbolic  Cholesky  algorithm  it  also  follows  that,  if  A  is  a  sample  covariance 
matrix,  the  sines  are  sample  partial  correlation  coefficients.  Section  paralg  provides  both,  algebraic 
and  a  geometric  derivations  of  this  result. 

Since,  given  a  data  matrix  B,  sample  partial  correlations  can  be  determined  in  a  stable  way 
from  a  Cholesky  factor  of  A  =  BT B  it  is  recommended  to  determine  them  directly  from  the 
columns  of  B  instead  of  forming  A  and  inviting  potential  loss  of  numerical  accuracy:  compute 
the  QR  decomposition  B  =  QU\  apply  plane  rotations  to  transform  U  to  L;  the  sines  of  the 
rotation  angles  are  partial  correlations.  More  specifically,  the  transformation  of  U  to  L  yields  all 
partial  correlations  between  variables  Bi  and  given  the  variables  inbetween:  5t+1  . . .  B,+k~i- 

Section  arbpar  concludes  the  paper  with  some  remarks  on  how  to  go  about  computing  sample 
partial  correlations  given  arbitrary  sets  of  variables. 


2.  The  Generalized  Bareiss  Algorithm 

In  [3]  E.  Bareiss  proposed  an  algorithm  to  solve  the  linear  system 

Ax  =  b, 

where  A  is  a  n  x  n  Toeplitz  matrix;  A  does  not  have  to  be  symmetric. 

2.1.  Description  of  the  Algorithm 

The  algorithm  reduces  A  to  triangular  form,  modifies  the  right-hand  side  6  accordingly,  and 
determines  x  by  backsubstitution.  The  reduction  process  operates  on  the  2 n  x  n  array 


CO- 


and  Bareiss  proposes  two  versions  of  this  process  (see  Sections  2  and  3  in  [3]).  We  will  generalize 
the  second  version  that  treats  the  upper  and  lower  halves  of  the  array  symmetrically,  and  merely 
ignore  the  condition  that  A  be  Toeplitz. 

The  reduction  process  proceeds  by  removing  successive  superdiagonals  in  the  upper  half  of  the 
array  and  successive  subdiagonals  in  the  lower  half  of  the  array.  At  completion,  the  array  is  of  the 
form 
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where  L  is  a  lower  triangular  matrix  and  U  is  an  upper  triangular  matrix. 

.  Before  providing  a  formal  description  of  the  generalized  Bareiss  algorithm,  we  will  illustrate 

the  algorithm  by  considering  the  case  n  =  4.  The  initial  array  is  of  the  form 
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The  stars  indicate  the  elements  to  be  modified  in  step  1,  the  middle  two  rows  remain  unchanged. 
The  reduction  process  proceeds  as  follows. 


1.  Linear  combinations  of  row  i  in  the  upper  half  of  the  array  with  row  *  +  1  in  the  lower  half  of 
the  array  eliminate  element  (i,  i  +  1)  in  the  upper  half  and  element  (i  +  1,  »)  in  the  lower  half, 
1  <  i  <  3: 
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Super-  or  subdiagonal  elements  doomed  for  elimination  are  represented  by  s,  and  diagonal  ele¬ 
ments  participating  in  the  elimination  by  d.  Appropriate  multiples  for  the  linear  combinations 
are  made  up  from  the  ratios  s/d. 

2.  Linear  combinations  of  row  i  in  the  upper  half  of  the  array  with  row  i  +  2  in  the  lower  half  of 
the  array  eliminate  element  (t,  t  +  2)  in  the  upper  half  and  element  (*  +  2,  i)  in  the  lower  half, 
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Note  that  the  two  rows  responsible  for  an  elimination  contain  zeros  in  corresponding  positions, 
hence  no  previously  introduced  zeros  are  destroyed. 

3.  Linear  combinations  of  row  1  in  the  upper  half  of  the  array  with  row  4  in  the  lower  half  of  the 
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array  eliminate  element  (1,4)  in  the  upper  half  and  element  (4, 1)  in  the  lower  half: 
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The  matrices  L  and  U  obtained  through  this  process  will  be  shown  in  Section  usefac  to  be  the 
right  factors  in  the  UL  and  LU  decompositions  of  A.  The  total  number  of  operations  to  find  both 
decompositions  comes  to  2  (n  ~  *)  =  n(n- 1)  divisions  and  2  ^"Jj1  (n  —  *)2  =  \n(n— j)(n- 1) 
multiplications;  this  is  twice  the  number  of  operations  for  Gaussian  elimination  without  pivoting, 
which  determines  only  one  of  the  decompositions. 

In  the  formal  description  of  the  generalized  Bareiss  algorithm  let,  at  the  start  of  step  k,  a-*-1* 
denote  row  i  in  the  upper  half  of  the  array,  and  o-~fc+1'  row  t  in  the  lower  half  of  the  array.  Thus 
at-+0*  =  a--0^  =  a,  equals  the  ith  row  of  A.  The  jfth  element  in  row  a-*-1*  is  denoted  by 
and  the  jth  element  in  row  aj-fc+1)  by  a-~i+1^.  The  reduction  process  may  now  be  expressed  as: 


Generalized  Bareiss  Algorithm 


1  <  t  <  n, 


for  k  =  1  to  n  —  1, 


1  <  i  <  n  —  fc, 
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Note,  that  we  attempt  to  distinguish  parallelism  in  the  algorithms  by  employing  two  different 
notations  for  repetitive  operations:  for-loops,  such  as  the  fc-loop  above,  increase  in  unit  increments 
and  iterations  are  performed  one  after  the  other;  ranges,  like  1  <  i  <  n  —  k,  denote  independent 
iterations  that  can  all  be  performed  in  parallel  for  a  given  k. 

2.2.  UL  and  LU  Matrix  Factorizations 

In  order  to  show  that  the  algorithm  computes  the  right  factors  in  the  UL  and  LU  decomposi¬ 
tions  of  the  matrix  A  it  is  convenient  to  regroup,  for  each  step  k,  all  the  operations  performed  on 
pairs  of  rows  into  a  single  matrix  operation.  Let  R(k)  denote  the  rectangle  at  the  start  of  step  k, 
k  >  1,  that  consists  of  the  ‘active’  rows  1 .  ..n  —  k  in  the  upper  half  of  the  array.  Similarly,  R(~k) 
denotes  the  rectangle  at  the  start  of  step  k  that  consists  of  the  active  rows  k  +  1 ...  n  in  the  lower 
half  of  the  array. 


I 


There  are  two  main  diagonals  in  a  rectangular  matrix:  one  starting  from  the  top  left  corner, 
and  the  second  ending  at  the  bottom  right  corner.  As  k  increases,  the  leftmost  (top)  main  diagonal 
in  R.W  and  the  rightmost  (bottom)  main  diagonal  in  R(~k)  shrink  in  place,  they  are  denoted  by 
D W  and  D^~k\  respectively  (cf.  the  d  elements  in  the  example).  These  diagonals  are  also  portions 
of  the  main  diagonals  in  the  2 n  X  n  array.  The  diagonal  matrices  associated  with  the  two  remaining 
main  diagonals  in  RW  and  R^~k^  are  denoted  S ^  and  respectively  (cf.  the  s  elements  in 

the  example).  Within  the  array  these  diagonals  are  portions  of  the  fcth  superdiagonal  and  of  the 
Arth  subdiagonal,  respectively.  The  multipliers  in  the  reduction  process  are  the  respective  ratios  of 
the  diagonal  elements  of  S ^  and  (i.e.  the  elements  to  be  eliminated  in  R ^  and  R(~k>)  and 

the  diagonal  elements  of  B^~k^  and  D M  (the  pivots).  Thus,  at  the  start  of  step  k,  k  >  1, 
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Then,  assuming  and  D(~k^  to  be  non-singular,  step  k  consists  of  replacing  the  rectangles 
(RW\  .  (  In-k  -SW(D(-k))-'\(  *<*>  \ 

We  have  to  show  that  the  product  of  the  multiplier  matrices  can  be  appropriately  partitioned 
into  upper  and  lower  triangular  matrices.  In  order  to  examine  the  band  structure  of  matrices  we 
distinguish  the  two  outermost  non-zero  diagonals  of  a  matrix:  Let  the  fcth  diagonal  of  a  matrix  M, 
consisting  of  elements  have  index  k  where  k  ranges  from  —  (n  —  1)  ton-1.  The  valuation  of 

a  matrix  M  is  defined  as  the  smallest  index  v(M)  of  a  non-zero  diagonal  in  M ;  the  degree  of  M  is 
the  highest  index  S(M)  of  a  non-zero  diagonal  in  M.  Denote  by  B„,s  the  class  of  nxn  matrices  with 
v(M)  >  v  and  6(M)  <  6.  For  instance,  Bo,n-i  is  the  class  of  upper  triangular  matrices,  B_(n_i)i0 
the  class  of  lower  triangular  matrices,  and  Bi,n_i  (B_(n_i)>_1)  the  class  of  strictly  upper  (lower) 
triangular  matrices. 

From  the  multiplication  and  addition  of  band  matrices  one  knows:  if  Mi  G  B^,  and  M2  G 
B„,,s,  then  the  product  M1M2  G  BUl and  the  sum  A/i  +  M2  G  When 

we  just  want  to  emphasize  the  band  structure  of  a  matrix  M  in  the  class  B„,i,  we  write  MUis  in 
place  of  A/,  Thus,  M^^^Mi^^  ~  -'fyj -f ^ ,-5i  -4-62  s-nd  M^^i  d-  Af^(52  ~  *^Luin{ } ,max{^i ,6? } •  Df 
course  cancellations  could  occur  and  the  indices  u  and  S  in  MVts  may  be  smaller  than  the  valuation 
of  M  and  larger  than  its  degree,  respectively. 

The  transformation  performed  on  the  2 nxn  array  at  step  k  is  a  premultiplication  by  the 
2 n  x  2 n  multiplier  matrix 
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where  —S^k\D^  *))-1  and  — are  diagonal  of  order  n  —  k.  Hence  this  transformation 
has  the  structure 


(  M),o  Mk,k  \ 

\  M-k,-k  Mo, o  )  ' 


It  will  now  be  shown  by  induction  that  the  product  of  the  transformations  from  step  1  to 
step  k,  1  <  k  <  n  —  1,  is  of  the  form 


(  Mu-i  M\,k  \ 

\  M-k,- i  M-(k-i),o  )  ' 


The  above  representation  is  obviously  true  for  k  =  1.  Assume  now  that  2  <  k  <  n  —  1  and  that  the 
product  of  the  transformations  from  step  1  to  step  A:  —  1  is 


/  Mu— 2  M\,k-\  A 

i),_i  M.{k-2)fl)  ‘ 


The  product  of  the  transformations  from  step  1  to  step  k  is  then 
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Consequently,  the  product  of  the  transformations  from  step  1  to  step  n  is  of  the  form 


(U+  U#\ 

U#  U)' 


where  U+  and  Z+  are  respective  n  X  n  upper  and  lower  triangular  matrices,  and  U#  and  Z#  are 
respective  n  x  n  strictly  upper  and  lower  triangular  matrices.  A  slightly  more  refined  induction 
proof  exploiting  the  fact  that  the  Mq,q  submatrices  in  the  fcth  transformation  are  actually  n  x  n 
identity  matrices  shows  that  the  product  of  transformations  from  step  1  to  step  k  has  the  form 


/  In  +  M\,k-l  Mi,k  ^ 

\  M-k,-l  In  +  M_(k-\),-\  )  ’ 


so  that  U+  and  Z+  have  unit  diagonal. 

Thus,  the  generalized  Bareiss  algorithm  effects 


(U+  U#\(A\  =  ((U+  +  U#)A\  =  [L\ 

U#  l+J\aJ  \(L+  +  L#)aJ  ~  [uj  ’ 


where  L  is  n  x  n  lower  triangular  and  V  n  x  n  upper  triangular.  Since  U+  +  U$  and  Z+  +  Z#  are 
respectively  unit  lower  and  upper  triangular  matrices,  the  matrices  Z  and  U  constitute  the  right 
factors  in  the  LU  and  UL  decompositions  of  A: 


A  =  UL ,  6  =  (U+  +  I/#)"1 
A  =  LU,  L  =  (L+  +  l#)~l 
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2.3.  Conditions  for  Successful  Factorization 

If  the  matrices  D W  and  D^-~k\  1  <  fc  <  n  —  1,  are  non-singular  then  the  generalized  Bareiss 
algorithm  runs  to  completion  and  delivers  LU  and  UL  decompositions  of  A  (a  necessary  and  suf¬ 
ficient  condition  can  be  derived  assuming  the  ratio  s/d  is  set  to  zero  whenever  s  —  d  =  0).  The 
matrices  D W  and  D^~k^  are  non-singular  when  their  diagonal  elements 
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are  all  non-zero.  We  shall  now  interpret  this  condition  in  terms  of  the  original  matrix  A. 
The  contiguous  principal  submatrix 
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of  order  k  —  1  of  A  is  called  the  ( k  -  l)-block  after  (row)  i  in  A  or  the  (k  —  l)-blork  before  (row)  i  +  k 
in  A.  The  row  vectors  a\k~^  and  a-~£+1^  in  the  generalized  Bareiss  algorithm  at  step  k  can  be 
expressed  in  terms  of  the  inverse  of  the  ( k  —  l)-block  after  i  in  .4: 
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Thus  ap  ^  represents  the  tth  row  of  the  Schur  complement  of  the  (k  —  l)-block  after  i  in  A,  and 

a!+fc+1)  the  (*  +  fc)th  row.  The  Schur  complement,  as  defined  as  by  Ando  [2],  is  a  n  x  n  matrix  with 
rows  i  ...  i  +  k  equal  to  0  and  columns  i  ...  i  +  k  equal  to  0. 

The  two  equations  above  constitute  a  direct  consequence  of  the  quotient  property  of  Schur 
complements  (Theorem  3  in  [2]).  Indeed  the  property  is  true  for  k  =  1,  since  it  states  a|+0*  = 
a--0*  =  a,-.  Assuming  it  is  true  for  the  vectors  ajfc-1^  and  a[+*+1\  the  right-hand  side  in  Bareiss’ 
first  recurrence  relation 

aW  _  Jk-i)  _  (k-i),  l-k+i)  r,  t-k+i) 
ai  ~  ai  ai,i+k  Vai+fc,i+fc/  °t+fc 

is  row  i  of  the  Schur  complement  of  the  1-block  after  i  +  k  —  1  in  the  Schur  complement  of  the 
( k  —  l)-block  after  i  in  A  and,  from  the  quotient  property,  this  is  row  i  of  the  Schur  complement  of 
the  fc-block  after  i  in  A: 


/  L\ 

ai  =  ai  —  (  aM+l  •  •  •  ai,i+k  ) 


®t+l,i+Jfc 


®i  +  fc,  t  +  1 


®t+fc,t+k 


Similarly  the  right-hand  side  in  the  second  recurrence  relation 

a<-fc)  =  a(-fc+l!  -  a<-fc+1)(a<*-1))-1a^-1) 

t+k  i+k  «+<:,»  V  t,»  >  1 

must  be  row  i  -f  k  of  the  Schur  complement  of  the  fc-block  before  t  +  k  in  A: 


a.'+A?  =  a«+*  ~  (  ai+k,i  ■  ■  •  ai+k,i+k-i  ) 


Ot.t+k-1 


ai+jfe-l, «■+*-! 


&i+k- 1 


The  interpretation  of  the  intermediate  quantities  in  the  algorithm  is  now  clear:  denotes  the 

jth  element  of  row  t  in  the  Schur  complement  of  the  principal  submatrix  of  order  k  just  after  row  i 
in  A ,  similarly  a-”*'  is  the  jth  element  of  row  i  in  the  Schur  complement  of  the  principal  submatrix 
of  order  k  just  before  row  i  in  A. 

Clearly  the  condition  that 


*11  >  “22  »  an—k,n—k  * 


1 <  k < n-  1 


all  be  non-zero  is  equivalent  to  the  condition  that  the  products 

n ai i  *1'  n ai+/i +i*  •  ■  n 1<  k  <n  -  1 
j=i  i= i  i=i 

all  be  non-zero.  Likewise, 

fc+l)  A— fc+l)  nl-fc+l)  1  <  Jb  <  n  —  1 

aifc+l,fc+l’  afc+2,/fc+2»  an,n  '  I  S  K  S  »  t 

are  non-zero  if  and  only  if 

n ai+>4+j’  n4^.  •••’  i  <  ^  < « - 1 

J=1  i=i  >= l 


are  non-zero.  The  quotient  property  and  Schur’s  determinantal  formula  (Section  4  in  [4])  imply  that 
the  above  products  are  equal  to  the  respective  determinants  of  the  (k  —  l)-blocks  after  1,  2,  , . . . ,  n— 
k,  and  before  fc+1,  fc  +  2,  ...,  n  for  i  <  &  <  n  —  1.  If  the  determinants  of  the  first  (second)  set  of 
blocks  are  non-zero  then  all  contiguous  principal  submatrices  of  orders  1  ...  n  —  1  of  the  leading 
(trailing)  principal  submatrix  of  A  of  order  n  —  1  are  non-singular,  hence  all  contiguous  principal 
submatrices  of  A  of  orders  1  ...  n  -  1  are  non-singular.  Note  that  this  condition  is  significantly 
more  stringent  than  if  Gaussian  elimination  without  pivoting  were  applied  twice  to  compute  U 
and  L,  then  only  all  leading  and  trailing  principal  submatrices  of  orders  1  ...  n  —  1  should  be 
non-singular. 


3.  The  Hyperbolic  Cholesky  Algorithm 

From  now  on,  only  symmetric  positive-definite  matrices  A  will  be  considered.  Let  L  and  U 
be  the  respective  lower  and  upper  triangular  Cholesky  factors  of  A:  /I  =  LT L  =  UTU.  Since  the 
generalized  Bareiss  algorithm  computes  the  UL  and  LU  decompositions  of  A  with  unit  triangular 
left  factors,  the  algorithm  could  be  applied  to  obtain  the  Cholesky  factors  L  and  U  of  A  by  simply 
scaling  every  row  in  the  right  factors  by  the  square-root  of  the  corresponding  diagonal  element. 

3.1.  Derivation  of  the  Hyperbolic  Cholesky  Algorithm 

A  more  balanced  alternative,  presented  in  the  theorem  below,  is  to  modify  the  generalized 
Bareiss  algorithm  and  scale  intermediate  quantities  in  order  to  obtain  the  Cholesky  factors  directly. 
Note  that  the  theorem  and  its  proof  remain  true  for  symmetric  positive-definite  matrices  A  with 
block  entries  if  transposes  are  inserted  in  the  right  places. 


Theorem  3.1.  If  t>'+0*  =  =  (a*,)  1  <  i  <  n,  and 

for  1  <  k  <  n,  1  <  i  <  n  -  k,  p,,I+fc  = 

~*r) {:&>)■ 

Proof.  The  proof  is  by  induction.  First  note  that  since  A  is  positive-definite  o„  is  strictly  positive, 
the  initialization  can  be  performed,  and  the  theorem  holds  for  k  =  0.  Suppose  it  also  holds  for 
k  >  0.  From  the  generalized  Bareiss  algorithm  and  the  induction  hypothesis 

a(fc+i)  _  a(*)  _  (*)  (Q(-k) 

a>  ~  ai  ai,i+k+i\ai+k+i,i+k+i)  “i+fc+i 
-  («!,*’)*  [»!*'  - 

=  («!*’)*  I1’,1*1  -  . 

Now,  from  the  quotient  property 

a(*+i)  _  a(*)  _  a(*)  (A~k)  ri_M) 

“u  ~  att  a«,t+fc+llai+Jk+l,i+k+ll  ai+k+l,. 

=  («!f¥a  -  p?..+k+i)(4fc))^ 

Since  A  is  positive-definite  and,  from  the  expression  of  the  determinant  of  a  contiguous  principal 
submatrix  as  a  product  in  Section  2.3,  a|,fc+1^  is  the  ratio  of  the  determinant  of  the  ( k  +  2)-block 
after  row  i  -  1  to  the  determinant  of  the  ( k  +  l)-block  after  row  i  it  follows  that  a-*+1*  >  0, 
\Pi,i+k+l  I  ^  1,  and 

(^  =  (a\i+1))Hl-pl+k+i)-^ 

Consequently, 

ajfe+1>  =  («|f+1))Mt+1). 

The  generalized  Bareiss  algorithm  and  the  induction  hypothesis  also  imply 


a(-*-D  -  o("fc)  -a{~k)  (a{k)rla{k) 

°i+k+l  ~  fli+fc+ 1  °»+fc+l,i'a«  )  ai 

~  \ai+k+l,i+k+l>  ["i+fc+l  'ai+k+l.j+fc+ll  a>'+k+l.»'.ai«  >  Vi 

t  (-*)  a  r  i-fc)  (k) 

—  (ai+fe+l,«+fc+l)2  [^i+fc+l  P«,i+fc+lu,  ) 


where  the  last  statement  is  obtained  by  observing  that  aj+£^j  f-  =  a-*+fc+1  for  a  symmetric  matrix. 
Making  again  use  of  the  quotient  property 

(ai+k+l,i+fc+l)2  =  (a!+fc+l,l+M-l)J(1  -  Pi.i+fc+l)  2- 


A-k-l)  _  ,„(-*-!) 


AJ-k- 1) 


t+i+l,t+k+l  >  ui+k+ 1  • 


L CKK2 


mmmm 


mmm 


Since  the  operations  applied  to  pairs  of  rows  are  hyperbolic  rotations,  the  algorithm  has  been 
called  the  Hyperbolic  Cholesky  algorithm  in  [5]: 


Hyperbolic  Cholesky  Algorithm 


1  <  i  <  n, 


I 

„(-o) 


_  ( (<*«) 

\{au)~^ai) 


for  k  =  1  to  n  —  1, 

!<»<»-*,  Pi,i+k  =  K.i+k'W+k.i+k 


-  r1 

(vi+k,i+k) 


ifa) " (1  ■  (->L  (£**■> 


The  rows  of  the  Cholesky  factors  are  given  by 


L  = 


vn\ 

\  ) 


V  («&>)-*-?>  ) 


u  = 


/  BJ-°)  \  /  (oi70>)-*al-°>  \ 

4_1)  = 

ui“;+1)/  V(aL;n+1)r^L'n+l)y 


This  algorithm  differs  slightly  from  the  one  presented  in  [5]  because  it  computes  both  L  and  U 
factors.  The  algorithm  in  [5]  determines  only  U  (or  only  L),  and  takes  advantage  of  the  property 
that  the  2x2  hyperbolic  rotation  matrices  may  be  computed  from  the  upper  triangular  (lower 
triangular)  part  of  each  row.  This  property  follows  from  the  symmetry  of  A;  when  applied  to 
a  non-symmetric  matrix  the  generalized  Bareiss  algorithm  cannot  be  decomposed  into  two  such 
independent  parts. 

The  operation  count  for  computing  both  Cholesky  factors  is  |n(n  -  1)  square  roots,  n(n  -  1) 
divisions  and  about  |n3  multiplications;  this  is  four  times  the  number  of  operations  of  the  Cholesky 
algorithm.  However  the  number  of  hyperbolic  rotations  is  the  same  as  the  number  of  multiplica¬ 
tions  in  the  Cholesky  algorithm  and,  since  specialized  hardware  may  be  devised  that  determines 
and  applies  hyperbolic  rotations  as  fast  (on  about  twice  the  area)  as  sequential  multipliers,  the  Hy¬ 
perbolic  Cholesky  algorithm  may  in  some  instances  be  competitive  with  the  Cholesky  algorithm. 
In  the  context  of  parallel  computation  on  systolic  arrays,  with  nearest  neighbor  processor  intercon¬ 
nections  and  no  broadcasting,  Hyperbolic  Cholesky  computes  a  single  Cholesky  factor  faster  than 
the  Cholesky  algorithm.  Hence  the  more  complex  processors  bring  about  an  advantage  in  speed 
[8], 

3.2.  Relation  between  Upper  and  Lower  Triangular  Cholesky  Factor 

With  regard  to  parallel  computation,  data  flow  graphs  have  become  a  popular  tool  to  visualize 
and  analyze  the  partial  order  of  computations  in  an  algorithm.  For  instance,  the  graph  in  Figure  1 
depicts  the  partial  order  of  computations  in  the  Hyperbolic  Cholesky  algorithm  for  n  =  4;  hyperbolic 
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(an)  J<*i 


(a 22)  5<*2 


(033)  2a3 


(a44)  2cl4 


Figure  1:  Data  Flow  Graph  for  the  Hyperbolic  Cholesky 
Algorithm,  n  =  4. 


rotations  whose  tangents  Pi,i+k  have  the  same  index  k  belong  to  the  same  column,  thus  clearly 
showing  that  they  are  independent  and  can  be  determined  in  parallel. 

The  graph  can  also  be  viewed  as  the  realization  of  a  linear  transformation  in  terms  of  elementary 
operations  (hyperbolic  rotations);  the  transformation  takes  as  input  from  the  right  the  scaled  rows 
(a,,)- 3  a;  of  A,  and  delivers  the  rows  of  U  as  outputs  at  the  top  and  the  rows  of  L  as  outputs 
at  the  bottom.  Each  box  in  Figure  1  can  be  viewed  as  a  linear  four-terminal  element,  with  two 
inputs  from  the  right  and  two  outputs  to  the  left.  This  interpretation  illustrates  the  usefulness 
of  the  data  flow  graph  not  only  for  parallel  algorithm  analysis  but  also  for  parallel  algorithm 
synthesis.  Since  U  and  L  are  Cholesky  factors  of  the  same  matrix  A  they  must  be  related  by  a 
n  x  n  orthogonal  transformation,  and  the  graph  can  help  us  to  discover  a  decomposition  of  this 
orthogonal  transformation  in  terms  of  elementary,  2x2,  orthogonal  rotations.  We  simply  exchange, 
for  every  element,  the  roles  of  the  bottom  right  input  and  the  top  left  output,  which  now  become 
respectively  output  and  input.  The  corresponding  data  flow  graph  is  displayed  in  Figure  2;  the 
effect  of  this  exchange  on  the  parameters  of  the  four-terminal  elements  is  made  explicit  in  the 
theorem  below.  Note  that  the  intermediate  quantities  on  the  edges  in  Figures  1  and  2  are  identical. 


‘V 


Figure  2:  Data  Flow  Graph  for  the  Cholesky  Factor  Inter¬ 
change  Algorithm,  n  =  4. 


Theorem  3.2.  The  following  algorithm  transforms  the  upper  triangular  Cholesky  factor 


U  = 


(  4'0)  \ 


U -n+1)/ 


of  A  to  its  lower  triangular  Cholesky  factor  L: 

Cholesky  Factor  Interchange  Algorithm 

for  2t  +  k  =  3  to  2n  -  1, 

1  <  i  <  n,  k  >  1,  tu+k  = 


J\ 

Vi+k  , 


T  T 

Proof.  If  two  vectors  (  v'  w' )  and  (  v  w )  are  related  by  a  hyperbolic  rotation 

(:-)=  (x  ?)(:)•  4-1-* 


n 


th  =  Sh/Ch- 


then  the  vectors  (v1  w)  and  (  v  w')  are  related  by 

.-i 


(HS  ?)(;)■ 

Since  cj"1  =  (1  — 1\)^ ,  the  2x2  matrix  above  is  an  orthogonal  rotation 

(0-0  ?)(;)•  '-*-**■ 


Thus,  if  we  have 


from  the  Hyperbolic  Cholesky  algorithm  then 

„<*> 


,i+k  $i,i+k 
i+k  Ci,i+k 


where  the  elements  of  the  rotation  satisfy  =  Pi,i+k  and  Citi+k  =  (1  -  p2lx+k)%.  This  relation 
does  not  yet  provide  us  with  the  complete  algorithm:  p;,,+*  is  equal  to  vj  i+^(ri+fcT+*)’"1  but 
this  relationship  cannot  be  employed  directly  since  4+fc,i+l  is  not  available.  However,  the  Schur 
complement  interpretation  of  v\k^  indicates  that  =  0  hence  the  rotation  satisfies 


i,i+k  ~si,i+k 
:,i+k  Ci,i+k 


,(*-1) 

M+fc 

i+k,i+k 


from  which  it  directly  follows  that  cti;+fc  =  (1  +  tfi+k )  ^  and  siti+k  =  Cij+kU,i+k  with  U,i+k  = 

Since  at  each  step  the  superscript  of  every  row  being  updated  is  incremented 
by  1,  the  rotations  can  clearly  be  executed  in  the  specified  order  and  the  result  matrix  equals 

Mn-l)\ 

4-1 

V  v<°>  I 

But  this  is  just  the  lower  triangular  Cholesky  factor  L  of  A  as  produced  by  the  Hyperbolic  Cholesky 
algorithm. 


4.  Application  to  the  Computation  of  Sample  Partial  Correlations 

Now  consider  the  case  where  the  n  x  n  symmetric  positive-definite  matrix  A  represents  the 
sample  covariance  matrix  of  a  m  x  n  zero-mean  data  matrix  B:  A  =  (m  -  1  )-1Z?Tf?  [1]  (without 
loss  of  generality,  the  scaling  factor  (m  —  l)-1  is  omitted  in  the  sequel).  The  m  x  n  matrix 


bn) 
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contains  a  zero-mean  column  vector  6;  for  each  of  n  real  random  variables  B,,  1  <  i  <  n.  Now 
the  results  of  the  previous  section  are  employed  to  derive  an  algorithm  for  the  computation  of 
certain  sample  partial  correlations  of  B  that  does  not  require  formation  of  BT  B.  We  will  give 
both,  algebraic  and  geometric  derivations  of  the  algorithm. 

4.1.  Algebraic  Derivation 

From  Theorem  3.1  we  know  that  pi,i+i  satisfies 


Pi.,+1  =  =  (a»)  ^aM+i(a«+i,i+i)  l<i<n-l. 

Since  A  =  BT B,  Pi,i+i  represents  the  sample  correlation  coefficient  between  Bi  and  Bi+ 1  [l].  Fur¬ 
thermore, 


(i) 

Pi.i+2  —  vi,i+ 2 


Vut+2,i+2l 


(41})  M.V+2 


(a 


(-1) 
i+2,i+2 


H, 


1  <  i  <  n  —  2. 


Hence,  pij+2  represents  the  sample  partial  correlation  coefficient  between  Bi  and  Bi+ 2  holding  B{+\ 
fixed  [l].  In  general, 


Pi,i+k  =  (Vi+k,i+D 


1  <  k  <  n  -  1, 


is  the  sample  partial  correlation  coefficient  between  B,  and  Bl+k  holding  the  variables  inbetween, 
Bi+1  ...  Bi+k-!,  fixed  [1]. 

The  upper  triangular  matrix  U  in  the  QR  factorization  B  =  QU  of  B  (where  Q  has  orthonor¬ 
mal  columns)  is  also  the  upper  triangular  Cholesky  factor  of  A  s.nce  A  —  UTQTQU  =  UTU. 
Theorem  3.2  asserts  that,  in  the  transformation  from  U  to  L,  the  sines  o;  the  rotations  are  equal 
to  the  partial  correlations. 

Accordingly,  the  following  algorithm  computes  the  partial  correlations  ptll+k  directly  from  the 
data  matrix  B: 

1.  Compute  the  QR  decomposition  B  =  QU  of  B,  where  Q  has  orthonormal  columns  and  U  is 
upper  triangular  with  positive  diagonal  elements. 

2.  Use  the  Cholesky  Factor  Interchange  algorithm  to  transform  U  to  L. 

3.  The  sine  of  the  rotation  eliminating  element  ( i,  i+k)  is  the  sample  partial  correlation  coefficient 

Pi,i+k  between  Bi  and  holding  the  variables  inbetween,  Bi+i  •  ■  ■  ,  fixed. 

As  shown  in  [6,  7]  this  algorithm  avoids  the  loss  of  numerical  accuracy  associated  with  the  formation 
of  BTB. 

4.2.  Geometric  Derivation 

Because  a  purely  algebraic  interpretation  does  not  seem  to  be  very  insightful,  we  offer  a  geo¬ 
metric  derivation  of  Theorem  3.2,  starting  with  two  simple  examples. 

First  consider  the  2x2  case.  Let 


U  =  (  u“  Ul2) 

\  0  U22J 

be  the  upper  triangular  matrix  (with  positive  diagonal  elements)  in  the  QR  decomposition  of  the 
m  X  2  matrix  B.  Since  atJ  =  bjb:  we  know  from  Theorem  3.1  that  the  sample  correlation  coefficient 
equals 


Pi,i+\  =  ibjbi)-^bjbi+l)(bj+1bl+^ 


C2 


Figure  3:  Angles  in  the  2x2  Example. 

and  thus  represents  the  cosine  of  the  angle  between  Bi  and  £,+i .  So,  the  sample  correlation  pj2 
between  B\  and  B2  is  the  cosine  of  the  angle  012  between  the  two  columns  of  U .  Because  of  the 
triangular  structure  of  U  its  first  column,  ( un  0  )T,  is  a  positive  multiple  of  the  first  canonical 
vector  ei  =  ( 1  0  )T  while  its  second  column  is  a  linear  combination  of  ej  and  the  second  canonical 

vector  e2  =  (0  1  )T . 

The  columns  of  the  matrix  V  may  be  rotated  in  such  a  way  that  the  second  column  becomes 
a  positive  multiple  of  e2  thereby  turning  the  first  into  a  linear  combination  of  ei  and  e2: 

D- 

The  angle  between  e\  and  e2,  denoted  by  Z(ej,e2),  is  +jt/2  and,  using  the  same  orientation  con¬ 
vention,  the  angle  between  the  two  columns  is  denoted  by  0\2  =  Z(ux,u2).  The  fact  that  the  first 
column  is  a  positive  multiple  of  ei  implies  L{e j,u2)  =  0\2-  To  turn  the  second  column  into  a 
positive  multiple  of  e2  requires  that  all  columns  of  V  be  rotated  by  the  angle 

Z(u2,e2)  =  Z(ei,e2)  -  Z(ei,w2)  =  tt/2  -  0i2, 


see  Figure  3.  Since  the  angle  between  the  two  columns  of  U  is  preserved  under  the  rotation,  and 
the  angle  of  such  a  rotation 


completes  0\2  to  a  right  angle: 


s  =  sin(?r/2  -  0\2)  =  cos012  =  p12. 

Consequently,  the  desired  sample  correlation  is  the  sine  of  the  rotation  0. 
Let  us  take  a  brief  look  at  the  3  x  3  case 

(«n  «i2  u13\ 

0  U22  U2 3  |  . 

0  0  U33/ 
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At  first,  because  the  second  column  has  only  one  more  non-zero  element  than  the  first,  the  columns 
of  U  can  be  rotated  in  the  (e\,  e2)-plane  so  as  to  make  the  second  column  co-linear  with  e2, 


(Cl2  -512  0\  /tin  «12  tli3  \  /*  0  *  \ 

S12  C12  Olio  tl22  U23  I  =  I  *  *  *  1  , 

0  0  1/  \  0  0  U33  /  \0  0  t»33  / 


and  p\2  —  Si2-  Here,  ‘co-linear’  is  used  to  mean  ‘a  positive  multiple  of  and  *  denotes  terms  that 
are  non-zero  in  general. 

In  order  to  provide  the  geometric  meaning  of  the  partial  correlations  we  appeal  again  to 
Theorem  3.1: 


(i)  <  (-1) 

Pi,i+ 2  =  Vi,i+2(Vi+2’i+2 


2 


(a 


(-1) 

»+2,«+2 


r*. 


1  <  t  <  n  -  2, 


where 


=  a«,<+ 2  -  a«,i+i(««+i,«+i)  1®i+i,i+2  =  (bfbi+i)  -  {bj bi+i)(bj+1bi+i)  1(bJ+1bi+2) 

=  bf[Im  -  6,+i(if+16t+i)-16f+1]6t+2. 

The  center  matrix 

Im  ~  bi+l(bj+1bi+l)-lbj+l 

is  an  orthogonal  projector  onto  the  subspace  orthogonal  to  6t+i  and  as  such  symmetric  and  idem- 
potent,  so  that 


4%  2  =  bj(im  -  bi+l(bT+,bl+1ribj+1f{im  -  bi+1(bf+1bi+1ribj+l)bi+2  =  b^btf, 


where 

bV  =  6,-  -  6,+i(bf+i6,+i)-1(6f+i6<),  =  bi+ 2  -  bi+1{bf+1bi+1)~\bJ+lbi+2) 


are  the  respective  projections  of  bi  and  6;+ 2 
bi+i-  Similarly, 


4"  *  (W11, 


onto  the  subspace  orthogonal  to  the  vector  inbetween: 


»+2 


Hence, 


pm>2  =  [(6!1))TbJ1)]-ii(6|1))rfr|;»)]((6|;21))T6S)]-^ 


represents  the  cosine  of  the  angle  between  the  respective  projections  of  6,  and  6,+2  onto  the  subspace 
orthogonal  to  the  vector  inbetween,  bi+\. 

Thus,  to  achieve  conditioning  of  B\  and  £3  with  respect  to  B2,  the  first  and  third  columns 
need  to  be  projected  onto  the  subspace  orthogonal  to  the  second  column.  Due  to  the  triangular 
structure  of  U  and  the  effect  of  the  previous  rotation  the  second  column  is  co-linear  to  e2,  and 
the  subspace  orthogonal  to  it  is  just  the  plane  (ej,e3).  The  partial  correlation  p\2  can  then  be 
determined  from  that  rotation  that  makes  co-linear  with  e3  the  projection  of  the  third  column  onto 
(ei,e3).  Since  this  rotation  takes  place  in  a  subspace  orthogonal  to  the  second  column  it  does  not 
affect  the  second  column,  and  the  zero  element  introduced  by  the  previous  rotation  is  preserved: 


/C13  0  3|3  \  /*  0  *  \  (l  11  0  0\ 

0  1  0  *  *  *  =  *  *  *  I 

\5l3  0  C13  /  \0  0  1(33/  \  *  0  *J 
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and  p\3  —  ai3.  Note  that  another  non-zero  element  is  introduced  in  the  first  column. 

Again,  because  of  the  triangular  structure  oi  U  and  the  effect  of  the  second  rotation  the  zero- 
structure  of  the  second  and  third  columns  is  the  same  save  for  one  element,  the  second  column 
is  co-linear  with  while  the  third  is  a  linear  combination  of  ti  and  e3.  Thus  the  columns  of 
the  matrix  can  be  rotated  to  yield  pi$  by  applying  a  rotation  that  makes  the  whole  third  column 
co-linear  with  e3,  and  turns  the  second  column  into  a  linear  combination  of  e2  and  e 3. 

/I  0  0  \  //„  0  0\  //„  0  0\ 

I  0  C23  -S23  I  I  *  *  *  j  =  (  ^21  ^22  0  I 

\0  S23  C23  /  \  *  0  */  \h\  hi  I33 / 

and  P23  =  ^23- 

Now  we  are  ready  to  carry  out  this  geometric  argument  in  the  general  case.  As  it  is  easier  for 
the  induction  proof  to  follow  a  sequential  order  of  elimination  the  version  of  the  algorithm  presented 
in  the  theorem  below  is  sequential. 

Theorem  4.1.  Geometric  Version  of  Theorem  3.2. 

If  the  elements  in  the  Cholesky  factor  V  of  the  sample  covariance  matrix  A  are  eliminated  in  the 
order 

/*  T  2  ...  n  —  1  \ 

*  n  2n  —  3 

*  n(n  -  l)/2 

V  * 

that  is,  proceeding  row  after  row  from  top  to  bottom,  and  within  each  row  from  left  to  right, 
then  the  sine  of  the  rotation  that  eliminates  element  (i,  i  +  k),  is  equal  to  the  sample  partial 
correlation  pi^+k,  1  <  *  <  n  —  1,  1  <  k  <  n  -  i. 

Proof.  The  proof  proceeds  by  induction. 

The  induction  basis  comprises  the  computation  of  partial  correlations  between  B\  and  all  other 
variables.  To  start  with,  the  matrix  U  is  of  the  form 


From  the  2x2  case  one  can  see  that  elimination  of  element  (1,2)  in  the  upper  triangular  matrix  U 
by  a  rotation  in  plane  (ej,e2)  provides  P12.  The  second  column  of  the  resulting  matrix  becomes 
co-linear  to  e2  while  the  first  column  becomes  a  linear  combination  of  e\  and  e2.  Hence,  there  is  a 
new  non-zero  element  in  the  first  column  and  a  zero  has  been  introduced  in  the  first  row: 


The  3x3  case  showed  that  the  correlation  p\^  between  B\  and  Bj  given  B^  could  be  computed  by 
rotating  the  first  and  third  column  and  thereby  introducing  a  non-zero  element  in  position  (3, 1) 
and  a  zero  in  position  (1,3): 


Continuing  this  argument,  the  partial  correlation  between  B\  and  given  B2,  .  . ., 

Bk  is  computed  by  performing  a  rotation  in  plane  (ei,ei +k)  thereby  creating  a  zero  element  in 
position  (1, 1  +  &),  1  <  k  <  n  —  1.  Thus,  once  all  correlations  involving  B\  have  been  computed  the 
first  column  of  the  matrix  has  totally  filled  in,  and  the  first  row  is  zero  except  for  the  first  element: 


Assume  that  the  partial  correlations  Pij+k-i  have  already  been  computed  for  1  <  t,  1  <  k- 1  < 
n  -  i.  The  corresponding  matrix  is  of  the  form 


where  Lo  is  lower  triangular  and  Uq  is  upper  triangular. 

By  induction  hypothesis  the  entire  lower  triangular  part  of  the  leading  i  -  1  columns  is  non¬ 
zero,  and  the  zth  column  has  k  -  1  non-zeros  in  its  lower  triangular  part  due  to  the  computation 
ofpi,i+l,  •••»  Pi,i+k— 1- 

In  order  to  compute  the  next  correlation  pi,i+k  the  corresponding  columns  u>,-,  ...,  tn,+fc 
of  the  current  matrix  must  be  projected  onto  a  subspace  orthogonal  to  the  subspace  spanned  by 
Bi+ 1 ,  . . . ,  Bi+k-i  (by  an  induction  on  k  along  the  same  lines  as  above  one  can  easily  show  that  /><,,+* 
represents  the  cosine  of  the  angle  between  the  respective  projections  of  6,  and  &,■+&  onto  the  subspace 
orthogonal  to  the  vectors  inbetween,  61+i  ...  bi+k-\).  Due  to  the  initial  ‘nesting’  of  the  column 
subspaces  (i.e.  the  original  upper  triangular  structure  of  U )  the  trailing  components  i  +  k,  . . . ,  n  of 
. . . ,  Wi+k-i  are  zero;  and  due  to  the  rotations  performed  in  order  to  retrieve  previous  partial 
correlations  (i.e.  the  appearing  lower  triangular  structure  of  L )  the  leading  components  1,  . . . ,  i  of 
. . .,  are  zero.  Hence  the  subspace  spanned  by  S,+i,  . . .,  Bi+k- 1  is  the  space  spanned 

by  e;+i ,  . . . ,  ,  and  the  space  orthogonal  to  it  is  the  space  spanned  by  e\ ,  . . . ,  et  ,  e,+t,  . . . ,  en. 

Similarly, components  1,  ...,  *  —  1 ,  i+k,  ...,  n  of  Wi  and  components  1,  ...,  *  —  1,  j  +  fc  +  1,  ...,  n 


of  Wi+k  are  zero;  and  the  projections  of  u >,  and  iu,+fc  onto  ej,  . . . ,  e;,  ej+jfe,  . . . ,  en  are  respectively 
co-linear  to  e,-  and  a  linear  combination  of  e,  and  6,+*.  Thus,  is  obtained  by  applying  the 

rotation  in  plane  (e;,e,+fc)  that  makes  the  projection  of  Wi+k  co-linear  with  e,-;  pij+k  is  the  sine  of 
that  rotation.  After  the  rotation  the  matrix  has  the  form 


lo 

< 

I 

H 

«’i+l,«+/k-l 

Wi+l,i+k 

■9 

tfl 

1 

*  .  ' 

\ 

M 

wM 

Wi+k- 1  ,«+*-! 

U>i+k-l,i+k 

m 

Bi 

Wi+k,i 

Wi+k,i+k 

WM 

* 

Uq  J 

According  to  Theorem  3.2,  the  elimination  can  be  carried  out  in  parallel  as  follows 

3  ...  n  -  2  n  -  1 

4  ...  n  —  1  n 

5  ...  n  n  + 1 


*  2n  -  3 

* 

If  only  the  partial  correlations  are  of  interest,  and  the  matrix  L  is  not  needed,  about  half  of 
the  arithmetic  operations  can  be  saved  by  applying  the  rotations  merely  to  the  trailing  principal 
submatrix  of  interest. 

5.  Computation  of  Arbitrary  Partial  Correlations 

Subject  to  a  certain  inital  ordering  of  the  random  variables  Bi,  . . . ,  Bn  our  algorithm  computes 
the  partial  correlations  pj,,+fc  between  Bx  and  f?I+k  given  B{+ 1,  . . . ,  by  completely  reducing 

the  upper  triangular  matrix  U  to  a  lower  triangular  matrix  L.  However,  one  may  want  to  compute 
partial  correlations  where  variables  other  than  the  ones  inbetween  are  held  fixed.  To  this  end, 
we  extend  our  notation  by  introducing  superscripts  for  the  partial  correlations  to  indicate  the 
fixed  variables:  the  old  p,,;+*  is  replaced  by  the  new  p'i+t+'k+k~1 ,  and  in  general  pf}  denotes  the 
partial  correlation  between  variables  Bi  and  Bj  with  variables  held  fixed  whose  indices  belong  to 
5  (t,  j  1 5). 

Now,  other  partial  correlations  may  be  computed  by  performing  only  a  partial  reduction.  For 
instance,  consider  the  following  6x6  example 

«11  Ul2  «13  U14  ui5  ti16 

U22  U23  U24  U25  U26 

U33  u34  U35  U36 

U44  U45  U4g 

«55  «56 

1*66 

The  leading  three  columns  of  U  span  the  subspace  of  B\ ,  B2  and  B3,  and  this  is  equal  *0  the  space 
spanned  by  the  first  three  canonical  vectors  ei,  e2  and  e3  due  to  the  triangular  structure  of  U.  The 


space  orthogonal  to  it  is  the  one  spanned  by  e*,  e5,  eg  and  is,  because  of  the  triangular  structure, 
equal  to  that  of  columns  4  through  6  of  U  with  components  1  to  3  set  to  zero.  This  means  that 
the  correlation  p\f  between  B4  and  B5,  given  B\,  B2  and  B3,  can  be  computed  by  a  rotation  of  U 
in  plane  (e4,e5).  The  resulting  matrix  has  a  new  zero  in  column  five  and  a  fill-in  in  column  four: 


U12 

Ul3 

l»14 

«15 

til6 

U22 

«23 

U24 

U25 

«26 

U33 

«34 

«35 

U36 

* 

* 

* 

* 

* 

«66 

The  next  correlation  that  can  be  computed  is  pjg3,5  with  a  rotation  in  plane  (e4,e6),  the  subspace 
orthogonal  to  B\ ,  B2,  B3  and  Bs: 


U12 

«13 

U14 

Ul5 

Ul6 

U22 

U23 

«24 

U25 

«26 

«33 

U34 

«35 

«36 

'll 

* 

* 

* 

The  last  correlation  p\$  is  determined  by  completing  the  transformation  of  the  3x3  trailing 
principal  submatrix  to  lower  triangular  form: 


“12 

«13 

“14 

“15 

“16 

“22 

“23 

“24 

“25 

“26 

“33 

“34 

“35 

“36 

*'» 

*21 

*22 

*31 

*32 

*33 

In  general,  the  correlation  p^:a',+1'3~i  for  i  >  a  and  i  <  j  <  n  can  be  determined  be  preserving 
the  leading  a  rows  and  columns  of  U  and  transforming  the  trailing  principal  submatrix  of  order  n—a 
to  lower  triangular  form  La  by  appropriate  plane  rotations: 


Similarly,  the  computation  of  P$’l:,'-1’n-/*+1:n  for  j  <  n  -  0  +  1  and  1  <  i  <  j  is  accomplished 
by  transforming  U  to  lower  triangular  form  L  (or  obtaining  directly  a  QL  factorization  of  B)  and 
then  transforming  the  leading  0  x  0  principal  submatrix  of  L  to  upper  triangular  form  Up: 


*  .  * 

*  ...  * 


Combining  the  two  above  strategies  makes  it  possible  to  determine  pjj0,’,+1  J  1,n  <3+1'n  for 
a<i<j<n-/3  +  lby  transforming  the  trailing  (n  —  a)  x  (n  —  a)  principal  submatrix  of  U  to 
block  lower  triangular  form  La  (see  the  sketch  below)  and  subsequently  transforming  the  leading 
(/3  -  a)  X  (/3  —  a)  triangular  submatrix  of  La  to  upper  triangular  form  Ua^\ 


If  it  is  known  in  advance  which  partial  correlations  are  to  be  determined  then  the  columns 
of  the  m  x  n  data  matrix  may  be  ordered  so  as  to  minimize  the  number  of  arithmetic  operations 
succeeding  the  computation  of  the  Cholesky  factor. 

For  instance,  a  lower  bound  on  the  number  of  arithmetic  operations  in  the  computation  of 
pfj,  where  5  is  a  subset  of  k  >  0  numbers  in  1  ...  n  not  containing  i  and  j,  is  0(n  -  k )  since 
our  method  requires  at  least  one  rotation  to  compute  a  partial  correlation  and  the  dimension  of 
the  space  involved  is  n  —  k.  This  lower  bound  is  attained  by  ordering  the  columns  so  that  the 
set  5  represents  the  leading  k  columns  of  the  data  matrix  followed  by  columns  B,  and  Bj.  The 
correlation  pf)  can  then  be  determined  by  one  rotation  in  the  plane  (ejt+i,ejt+2)  that,  due  to  the 
triangular  structure  of  the  Cholesky  factor,  involves  0(n  ~  k)  non-zero  element  pairs. 

Not  only  the  ordering  of  the  columns  is  important  but  also  the  sequence  in  which  particular 
correlations  are  computed.  Consider  the  computation  of  a  partial  correlation  between  two  variables 
Bi  and  B}  with  successively  more  variables  fixed:  pfj,  . . .,  pf*,  where  Si  C  ...  C  S’*  and  i,  j  & 
Sic .  It  seems  that  the  following  order  of  rotations  constitutes  the  simplest  way  of  determining  the 
above  correlations.  It  is  illustrated  by  means  of  a  5  X  5  example  for  the  computation  of  pJ2,  Pi2, 
Pi2*,  and  Pi?  -  At  first  the  columns  of  the  data  matrix  are  ordered  so  that  i  and  j  represent  the  first 
two  columns  followed  by  the  columns  of  Si,  the  columns  of  S2  —  Si,  the  columns  of  S3  -  S2  -  Si, 
etc.  In  the  example  this  amounts  to  the  ‘natural’  ordering  B1...B5  of  the  variables.  The  first 
correlation  p12  can  now  be  computed  with  one  rotation  from  the  Cholesky  factor  U.  To  compute 
p?2  columns  2  and  3  of  U  are  exchanged  and  a  rotation  in  plane  (e2,e3)  results  in  the  Cholesky 
factor  U'  corresponding  to  the  data  matrix  with  variables  in  the  order  B\,  j53,  B?,  B4,  2?5.  Since 
Bj  is  situated  between  B\  and  B2  two  rotations  suffice  for  the  computation  of  p\2.  The  effect  of 
these  steps  on  the  matrix  is  depicted  below: 

/till  «12  «13  «14  «is\  /till  tl!3  Ui2  tin  tii5\  /tin  ti!3  UU  tin  ^15  \ 

ti22  tl23  li24  ti25  U23  ti22  tl24  ti25  ti22  tij3  tin  ^25 

U33  U34  U35  — ' >  U33  U34  U35  — ►  ti^j  u'34  1/35 

U44  U45  U44  U45  U44  U45 

V  ti55  /  \  U55  /  \  US5  / 
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1 


“34  “35 

li44  «45 


U44  U45 

^55 


Similarly,  exchanging  columns  3  and  4  of  U',  performing  a  rotation  in  plane  (e3,e4)  to  get  the 
Cholesky  factor  V"  of  the  data  matrix  corresponding  to  the  ordering  B\,  B3,  B4,  B2,  B5,  and 
performing  three  more  rotations  on  V"  results  in  the  extraction  of  p^4.  In  general,  if  the  sets  5/ 
differ  by  more  than  one  index,  more  columns  of  the  Cholesky  factor  must  be  exchanged  to  ensure 
that  all  fixed  variables  are  situated  between  B,  and  Bj. 

As  for  arbitrary  sequences  of  partial  correlations,  the  determination  of  the  column  ordering  of 
the  data  matrix  as  well  as  the  computation  sequence  of  the  partial  correlations  so  as  to  minimize 
the  number  of  arithmetic  operations  seems  to  be  an  NP-complete  problem.  The  use  of  heuristics, 
such  as  the  following  greedy  approach,  might  lead  to  acceptable  operation  counts:  the  random 
variables  are  ordered  so  that  as  many  partial  correlations  as  possible  can  be  determined  from  the 
resulting  Cholesky  factor.  Repeatedly,  the  columns  of  the  Cholesky  factor  are  then  re-ordered 
according  to  the  same  strategy,  the  matrix  returned  to  upper  triangular  form,  and  appropriate 
rotations  performed  until  all  correlations  have  been  computed. 
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